Structural Maximum a Posteriori Adaptation for Mixture Stochastic Trajectory Framework

Authors

Irina Illina

Djamel Mostefa

Abstract

In this paper we address the problem of adapting a speech recognition system to a new environment. The aim of adaptation is to compensate for the mismatch between training and testing conditions without completely retraining the recognition system. The questions are what has to be compensated, and how. We propose to compensate the means and variances of the Gaussian pdfs that represent the acoustic models, using linear transformations with ML and MAP estimation. To better take into account the variability of the adaptation data, the model pdfs are organised in a tree. This tree structure is also used to define the prior densities of the transformations. The approach is called Structural Maximum a Posteriori adaptation (SMAP). SMAP is developed for a segment-based model, the Mixture Stochastic Trajectory Model (MSTM). Experimental results on the RM task for supervised speaker adaptation show that SMAP significantly outperforms MLLR adaptation for the same amount of adaptation data and the same number of transformation parameters.
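To illustrate the idea of tree-structured priors described in the abstract, here is a minimal sketch, not the authors' estimator. It assumes a scalar mean shift per tree node (the paper uses linear transformations of both means and variances) and one common form of hierarchical MAP smoothing, in which each node's estimate interpolates between its own ML estimate and its parent's MAP estimate. The names `Node`, `smap_shifts`, and the prior weight `tau` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One tree node: a cluster of Gaussian pdfs sharing a scalar mean shift."""
    name: str
    count: float      # amount of adaptation data assigned to this node (assumed)
    ml_shift: float   # ML estimate of the mean shift from this node's data
    children: List["Node"] = field(default_factory=list)
    map_shift: Optional[float] = None


def smap_shifts(node: Node, prior_shift: float = 0.0, tau: float = 10.0) -> None:
    """Fill map_shift top-down, using the parent's MAP shift as the prior mean.

    tau (> 0) is an assumed prior weight: the larger it is, the more a node
    with little data falls back on its parent's estimate.
    """
    node.map_shift = (node.count * node.ml_shift + tau * prior_shift) / (node.count + tau)
    for child in node.children:
        smap_shifts(child, prior_shift=node.map_shift, tau=tau)


# Toy tree: one leaf with plenty of adaptation data, one leaf with none.
leaf_rich = Node("leaf_rich", count=90.0, ml_shift=2.0)
leaf_poor = Node("leaf_poor", count=0.0, ml_shift=0.0)
root = Node("root", count=90.0, ml_shift=1.0, children=[leaf_rich, leaf_poor])
smap_shifts(root)
```

In this toy tree the leaf with no data inherits the root's estimate unchanged, while the data-rich leaf stays close to its own ML estimate. This is the behaviour a structural prior is meant to provide when adaptation data is unevenly distributed across pdfs.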

Similar resources

Tree-structured Maximum a Po for a Segment-based Speech R

In this paper, the problem of the adaptation of a speech recognition system to a new environment is addressed. Recently, a Structural Maximum a Posteriori adaptation (SMAP) for a frame-based HMM model adaptation has been developed. In this method, acoustic model pdfs are organised in a tree and the means and variances of the pdfs are adapted using the linear transformations estimated under MAP ...

Maximum a posteriori adaptation for many-to-one eigenvoice conversion

Many-to-one eigenvoice conversion (EVC) allows the conversion from an arbitrary speaker’s voice into the pre-determined target speaker’s voice. In this method, a canonical eigenvoice Gaussian mixture model is effectively adapted to any source speaker using only a few utterances as the adaptation data. In this paper, we propose a many-to-one EVC based on maximum a posteriori (MAP) adaptation for...

Modeling Long Term Variability Information in Mixture Stochastic Trajectory Framework

The problem of acoustic modeling for speech recognizers is addressed. We distinguish two types of speech variability, long term (speaker identity, stationary noise, channel distortion) and short term (phoneme class). Currently, most recognizers model the two variabilities without considering their specificities, which may result in flat distributions with limited discriminability. In our system...

Modelling long term variability information in mixture stochastic trajectory framework

Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition

Gaussian mixture model support vector machine (GMM-SVM) with nuisance attribute projection (NAP) has been found to be effective and reliable for speaker and language recognition. In maximum a posteriori (MAP) adaptation of GMM, the relevance factor is the parameter that regulates how much the adaptation data affect the base model, which impacts the final recognition performance. In our previous ...

Publication date: 2001
